Foreword 



This Technical Specification (TS) has been produced by ETSI Technical Committee Satellite Earth Stations and 
Systems (SES). 

The contents of the present document are subject to continuing work within TC-SES and may change following formal 
TC-SES approval. Should TC-SES modify the contents of the present document it will then be republished by ETSI 
with an identifying change of release date and an increase in version number as follows: 

Version 1.m.n 

where: 

• the third digit (n) is incremented when editorial only changes have been incorporated in the specification; 

• the second digit (m) is incremented for all other types of changes, i.e. technical enhancements, corrections, 
updates, etc. 

The present document is part 6, sub-part 1 of a multi-part deliverable covering the GEO-Mobile Radio Interface 
Specifications, as identified below: 

Part 1: "General specifications"; 

Part 2: "Service specifications"; 

Part 3: "Network specifications"; 

Part 4: "Radio interface protocol specifications"; 

Part 5: "Radio interface physical layer specifications"; 

Part 6: "Speech coding specifications"; 

Sub-part 1: "Speech Processing Functions; GMR-1 06.001"; 

Sub-part 2: "Vocoder: Speech Transcoding; GMR-1 06.010"; 

Sub-part 3: "Vocoder: Substitution and Muting of Lost Frames; GMR-1 06.011"; 

Sub-part 4: "Vocoder: Comfort Noise Aspects; GMR-1 06.012"; 

Sub-part 5: "Vocoder: Discontinuous Transmission (DTX); GMR-1 06.031"; 

Sub-part 6: "Vocoder: Voice Activity Detection (VAD); GMR-1 06.032"; 

Part 7: "Terminal adaptor specifications". 
Introduction 



GMR stands for GEO (Geostationary Earth Orbit) Mobile Radio interface, which is used for mobile satellite services 
(MSS) utilizing geostationary satellite(s). GMR is derived from the terrestrial digital cellular standard GSM and 
supports access to GSM core networks. 

Due to the differences between terrestrial and satellite channels, some modifications to the GSM standard are necessary. 
Some GSM specifications are directly applicable, whereas others are applicable with modifications. Similarly, some 
GSM specifications do not apply, while some GMR specifications have no corresponding GSM specification. 

Since GMR is derived from GSM, the organization of the GMR specifications closely follows that of GSM. The GMR 
numbers have been designed to correspond to the GSM numbering system. All GMR specifications are allocated a 
unique GMR number as follows: 

GMR-n xx.zyy 

where: 

xx.0yy (z=0) is used for GMR specifications that have a corresponding GSM specification. In this case, the 
numbers xx and yy correspond to the GSM numbering scheme. 

xx.2yy (z=2) is used for GMR specifications that do not correspond to a GSM specification. In this case, only the 
number xx corresponds to the GSM numbering scheme and the number yy is allocated by GMR. 

n denotes the first (n=1) or second (n=2) family of GMR specifications. 
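
The numbering scheme above can be illustrated with a short parser. This is a minimal sketch, not part of the specification: the names `GmrNumber` and `parse_gmr_number` are hypothetical, and the pattern assumes the number is written exactly as "GMR-n xx.zyy" with a single space and two-digit xx and yy fields.

```python
import re
from typing import NamedTuple


class GmrNumber(NamedTuple):
    """Fields of a GMR specification number (hypothetical helper type)."""
    family: int                # n: 1 or 2
    series: str                # xx, follows the GSM numbering scheme
    z: int                     # 0 or 2
    index: str                 # yy
    has_gsm_counterpart: bool  # True when z == 0


# Assumed textual layout: "GMR-n xx.zyy", for example "GMR-1 06.010".
_PATTERN = re.compile(r"^GMR-([12]) (\d{2})\.([02])(\d{2})$")


def parse_gmr_number(text: str) -> GmrNumber:
    """Split a GMR number into its fields, or raise ValueError."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a GMR-n xx.zyy number: {text!r}")
    family, series, z, index = match.groups()
    return GmrNumber(int(family), series, int(z), index, z == "0")
```

For example, `parse_gmr_number("GMR-1 06.010")` reports a first-family specification with a corresponding GSM specification, whereas `parse_gmr_number("GMR-1 01.201")` reports z=2, i.e. no corresponding GSM specification.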

A GMR system is defined by the combination of a family of GMR specifications and GSM specifications as follows: 

• If a GMR specification exists it takes precedence over the corresponding GSM specification (if any). This 
precedence rule applies to any references in the corresponding GSM specifications. 

NOTE: Any references to GSM specifications within the GMR specifications are not subject to this precedence 
rule. For example, a GMR specification may contain specific references to the corresponding GSM 
specification. 

• If a GMR specification does not exist, the corresponding GSM specification may or may not apply. The 
applicability of the GSM specifications is defined in GMR-1 01.201 [10]. 
1 Scope 



The present document is an introduction to the GMR-1 06-series of technical specifications, which deal with the speech 
processing functions in the GMR-1 system. A general overview of each speech processing function is given, 
with reference to the technical specification where each function is specified in detail. 
3 Definitions and abbreviations 

3.1 Definitions 

For the purposes of the present document, the following terms and definitions apply: 

Voice Activity Detection (VAD): method of classifying short segments of speech as either "voice" or "background 
noise". The decision is based upon comparing the current level and spectral characteristics of the input signal with 
a typical level and typical spectral characteristics 

Comfort Noise Insertion (CNI): method of synthesizing low-level noise on the receive side during breaks in voice 
transmission. To increase the perceived voice quality, the synthesized noise has characteristics that are similar to the 
background noise present on the transmit side 

Forward Error Correction (FEC): method of introducing redundancy to binary data that allows for the detection 
and/or correction of errors introduced during transmission of that data 

V/UV (Voiced/Unvoiced): each spectral band is declared either "voiced" or "unvoiced", depending upon the amount of 
periodic energy in that band. This voicing decision is frequently referred to as a V/UV decision 

frame: data representing a full 40 msec of continuous data input to or output from the vocoder. The frame data may 
consist of model parameters, quantized bits, FEC encoded channel data, or speech samples at various points in the 
vocoder 

subframe: data representing 10 msec of continuous data input to or output from the vocoder, or the result of processing 
that data through various points in the vocoder. For example, "The second subframe of model parameters is passed to 
the quantizer" is a valid use of the term as is "The decoder outputs one subframe of 8 kHz speech samples" 

subframe number: each frame is composed of four consecutive subframes that are each assigned a subframe number. 
The first, second, third, and fourth subframes within a frame are assigned subframe numbers 0, 1, 2, and 3 respectively 

quantizer-frame: data representing the 20 msec of continuous vocoder data that is formed by combining subframes 0 
and 1 or subframes 2 and 3 

quantizer-frame number: each frame is composed of two consecutive quantizer-frames that are each assigned a 
quantizer-frame number. The first and second quantizer-frames within a frame are assigned quantizer-frame numbers 0 
and 1 respectively 

voice frame: 40-msec frame that contains some voice data but no tone data. It may also contain comfort noise data 

SID (Silence Descriptor) frame: 20-msec frame that contains only comfort noise data. No voice or tone data may be 
present in a SID frame 

tone frame: 40-msec frame that contains tone data. It may also contain voice data or comfort noise data 
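
The relationship between frames, quantizer-frames, and subframes defined above can be sketched as follows. This is an illustrative sketch only: the constant and function names are hypothetical, and the mapping assumes the numbering given in the definitions (subframes 0 and 1 form quantizer-frame 0, subframes 2 and 3 form quantizer-frame 1).

```python
from typing import Tuple

FRAME_MS = 40            # one frame
QUANTIZER_FRAME_MS = 20  # one quantizer-frame
SUBFRAME_MS = 10         # one subframe

SUBFRAMES_PER_FRAME = FRAME_MS // SUBFRAME_MS                # 4
QUANTIZER_FRAMES_PER_FRAME = FRAME_MS // QUANTIZER_FRAME_MS  # 2


def quantizer_frame_number(subframe_number: int) -> int:
    """Return the quantizer-frame number (0 or 1) holding a subframe (0..3)."""
    if not 0 <= subframe_number < SUBFRAMES_PER_FRAME:
        raise ValueError(f"subframe number out of range: {subframe_number}")
    return subframe_number // 2


def subframes_in_quantizer_frame(number: int) -> Tuple[int, int]:
    """Return the two subframe numbers combined into a quantizer-frame."""
    if not 0 <= number < QUANTIZER_FRAMES_PER_FRAME:
        raise ValueError(f"quantizer-frame number out of range: {number}")
    return (2 * number, 2 * number + 1)
```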

3.2 Abbreviations 

Abbreviations used in the present document are listed in GMR-1 01.004 [1]. 



4 Introduction 

The speech processing functions in the GMR-1 system include the following. 

Speech transcoding, which includes a speech encoder that converts digitized speech samples into a compressed binary 
bit stream and a speech decoder that converts a compressed binary bit stream into digital speech samples. 

Discontinuous transmission (DTX), which is used to reduce the transmission rate during periods of voice inactivity. 

VAD, which is used to identify periods of voice activity, as required by DTX. 
CNI, which is used to convey the characteristics of the background noise from the transmit end to the receive end of the 
connection, in an effort to reduce the modulation of background noise that would otherwise occur with DTX. 

Lost speech frame substitution and muting, which is used to mask transmission errors and stolen frames. 

Detection and regeneration of single-frequency and dual-tone multifrequency (DTMF) signals. 

All of the above functions are integrated into the GMR-1 5,2 kbps Version 1 vocoder [9]. 



5 Speech transcoding 

Speech transcoding is described in GMR-1 06.010 [4]. 

The speech encoder takes 16-bit uniform pulse code modulation (PCM) samples as input. One frame of encoded speech 
consists of two quantizer-frames, which each contain 80 bits. The encoded speech at the output of the voice encoder is 
delivered to the channel coding function defined in GMR-1 05.003 [2] to produce an encoded frame consisting of two 
quantizer-frames, each containing 104 bits. The vocoder therefore produces a gross bit rate of 5,2 kbps where 4,0 kbps 
are used for voice data and the remaining 1,2 kbps are used for error control. 

In the receive direction, the inverse operation takes place. 
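
The bit rates quoted above follow directly from the quantizer-frame sizes. The short calculation below is a sketch for checking the arithmetic; the names are illustrative, and it assumes one quantizer-frame every 20 msec as stated in the definitions.

```python
QUANTIZER_FRAME_MS = 20  # duration of one quantizer-frame
VOICE_BITS = 80          # bits per quantizer-frame at the voice encoder output
CODED_BITS = 104         # bits per quantizer-frame after channel coding


def rate_bps(bits_per_quantizer_frame: int) -> int:
    """Bit rate in bit/s for a given number of bits per 20-msec quantizer-frame."""
    return bits_per_quantizer_frame * 1000 // QUANTIZER_FRAME_MS


voice_rate = rate_bps(VOICE_BITS)      # 4000 bit/s, i.e. 4,0 kbps
gross_rate = rate_bps(CODED_BITS)      # 5200 bit/s, i.e. 5,2 kbps
error_control_rate = gross_rate - voice_rate  # 1200 bit/s, i.e. 1,2 kbps
```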

The GMR-1 system utilizes a 40-msec frame size and employs the vocoder. The vocoder accepts 160 ± 8 samples, at a 
sampling rate of 8 000 samples per second, in each 20-msec quantizer-frame. The voice encoder processes the samples 
in segments of 80 ± 4 samples. These 10-msec segments are called subframes, and each 20-msec quantizer-frame is 
divided into two subframes. The voice encoder must process two successive 10-msec subframes before it outputs any 
encoded data. The encoded data is output at 20-msec intervals, one quantizer-frame at a time. 
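
The sample counts above can be checked with a small sketch. It derives the nominal 80 samples per 10-msec subframe from the 8 000 samples per second rate, applies the ± 4 sample tolerance, and releases subframes in pairs to mirror the statement that two subframes are processed before encoded data is output. The class `SubframePairer` and the function `is_valid_subframe` are hypothetical helpers and do not describe the vocoder's internal interface.

```python
from typing import List, Optional, Sequence, Tuple

SAMPLE_RATE_HZ = 8000
SUBFRAME_MS = 10
NOMINAL_SUBFRAME_SAMPLES = SAMPLE_RATE_HZ * SUBFRAME_MS // 1000  # 80
SUBFRAME_TOLERANCE = 4                                           # 80 +/- 4


def is_valid_subframe(samples: Sequence[int]) -> bool:
    """True when the subframe holds between 76 and 84 samples."""
    return abs(len(samples) - NOMINAL_SUBFRAME_SAMPLES) <= SUBFRAME_TOLERANCE


class SubframePairer:
    """Hypothetical helper: collects subframes and releases them in pairs."""

    def __init__(self) -> None:
        self._pending: Optional[List[int]] = None

    def push(self, samples: Sequence[int]) -> Optional[Tuple[List[int], List[int]]]:
        """Store one subframe; return a pair once two have been received."""
        if not is_valid_subframe(samples):
            raise ValueError(f"unexpected subframe length: {len(samples)}")
        if self._pending is None:
            self._pending = list(samples)
            return None
        pair = (self._pending, list(samples))
        self._pending = None
        return pair
```

A pair returned by `push` always holds between 152 and 168 samples in total, which matches the 160 ± 8 figure.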



6 Discontinuous transmission (DTX) 

Discontinuous transmission is described in GMR-1 06.031 [7]. 

During a normal conversation, the talkers alternate so that, on average, each direction of transmission is occupied about 
50 % of the time. Discontinuous transmission (DTX) is a mode of operation where the transmitters are switched on only 
for those frames that contain useful information, which may be done for the following purposes: 

• in a mobile earth station the battery life would be prolonged or a smaller battery could be used for a given 
operational duration, due to decreased power requirements; 

• to reduce the average interference level on the "air", leading to better system performance; 

• to provide better link margin in the system. 

The overall DTX mechanism is implemented in the TX and RX DTX handlers described in GMR-1 06.031 [7] 
and requires the following functions, which are described in separate technical specifications: 

• a VAD on the transmit side; 

• evaluation of the background acoustic noise on the transmit side, in order to transmit characteristic parameters to 
the receive side; 

• generation on the receive side of a similar noise, called comfort noise, during periods where the radio 
transmission is cut. 

The above functions are all integral to the vocoder. 

The transmission of comfort noise information to the receive side is achieved by means of a special SID frame. A SID 
frame is a 104-bit quantizer-frame representing a 20-msec segment that contains no voice activity. A SID frame is 
contained within the final 40-msec frame of a voice burst, and serves as an end-of-voice marker at the receive side. In 
order to update the comfort noise characteristics at the receive side, SID frames are continuously transmitted during 
periods of voice inactivity but at a much lower data rate (100 bps). The actual transmission aspects of the SID frame on 
the radio link are described in GMR-1 05.008 [3]. 
For the overall DTX functionality, the DTX handlers use various flags to interface with the radio subsystem, which is in 
control of the transmitter keying on the TX side and performs preprocessing functions on the RX side, as also described 
in GMR-1 06.031 [7]. 
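
As a rough worked figure, the rates quoted in this clause and in clause 5 can be combined into an average transmitted bit rate under DTX. This is only an illustrative estimate: it assumes the gross rate of 5,2 kbps during voice activity, the 100 bps SID rate during inactivity, and a caller-supplied activity factor (about 0,5 for a normal conversation), and it ignores every other effect on the radio link. The function name is hypothetical.

```python
GROSS_RATE_BPS = 5200  # gross vocoder rate while voice is active
SID_RATE_BPS = 100     # SID update rate during voice inactivity


def average_rate_bps(voice_activity: float) -> float:
    """Average transmitted rate for a voice activity factor between 0 and 1."""
    if not 0.0 <= voice_activity <= 1.0:
        raise ValueError("voice_activity must lie between 0 and 1")
    return voice_activity * GROSS_RATE_BPS + (1.0 - voice_activity) * SID_RATE_BPS


# With each direction occupied about 50 % of the time:
typical = average_rate_bps(0.5)  # 2650.0 bit/s
```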



7 Voice activity detection (VAD) 

Voice activity detection (VAD) is described in GMR-1 06.032 [8]. 

The VAD is an integral part of the voice encoder described in GMR-1 06.010 [4]. The VAD is used to determine which 
frames contain voice activity and which frames contain only background noise. The VAD outputs a flag that is passed 
to the TX DTX handler. 
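
To give a feel for the kind of decision a VAD makes, the sketch below sets a flag by comparing the level of a segment with a background noise level. It is deliberately simplified and is not the algorithm of GMR-1 06.032 [8]: it uses level only (no spectral characteristics), and the function names, the `noise_rms` input, and the 6 dB margin are assumptions chosen for illustration.

```python
import math
from typing import Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of a segment (0.0 for an empty segment)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def vad_flag(samples: Sequence[float], noise_rms: float,
             margin_db: float = 6.0) -> bool:
    """Return True when the segment level exceeds the noise level by margin_db.

    noise_rms is an assumed estimate of the background noise level;
    margin_db is a hypothetical threshold, not a value from the specification.
    """
    level = rms(samples)
    if level == 0.0:
        return False  # digital silence is never voice
    if noise_rms <= 0.0:
        return True   # no noise estimate: any signal counts as active
    return 20.0 * math.log10(level / noise_rms) > margin_db
```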



8 Comfort noise aspects 

Comfort noise aspects are described in GMR-1 06.012 [6]. 

Switching the transmission on and off during DTX operation would modulate the background noise if no precautions 
were taken. When transmission is on, the background noise is transmitted together with the 
speech to the receiving end. When the speech burst ends, the transmission is switched off and the perceived noise would 
drop to a very low level. This step modulation of noise is perceived as very annoying and may reduce the intelligibility 
of speech if presented to the listener without modification. 

This noise contrast effect is reduced in the GMR-1 system by inserting an artificial noise, called comfort noise, at the 
receiving end when speech is absent. 

The CNI functions are integral to the vocoder as described in GMR-1 06.010 [4]. The CNI functions handle the 
following CNI tasks: 

• the evaluation of the spectral characteristics of the background noise at the transmitter; 

• encoding and decoding the noise information in SID frames; 

• generation of similar noise at the receive end. 
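
The idea of regenerating "similar noise" can be sketched in a very reduced form by matching level only. The example below measures the RMS level of a background noise segment and synthesizes Gaussian noise scaled to the same level. It is an assumption-laden illustration, not the method of GMR-1 06.012 [6]: the actual CNI functions work with spectral characteristics carried in SID frames, and the function names and the fixed seed are hypothetical.

```python
import math
import random
from typing import List, Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of a segment (0.0 for an empty segment)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def comfort_noise(target_rms: float, count: int = 80, seed: int = 0) -> List[float]:
    """Synthesize `count` noise samples whose RMS level equals target_rms.

    Level matching only; no spectral shaping is attempted in this sketch.
    """
    rng = random.Random(seed)
    raw = [rng.gauss(0.0, 1.0) for _ in range(count)]
    raw_rms = rms(raw)
    if raw_rms == 0.0:
        return [0.0] * count
    gain = target_rms / raw_rms
    return [gain * s for s in raw]


# Transmit side: measure the background level. Receive side: regenerate it.
background = [30.0, -30.0] * 40
noise = comfort_noise(rms(background))
```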



9 Lost speech frame substitution and muting 

Lost speech frame substitution and muting is described in GMR-1 06.011 [5]. 

In the receiver, frames may be lost due to transmission errors or frame stealing. GMR-1 06.011 [5] describes the actions 
to be taken in these cases, both for lost speech frames and lost SID frames during DTX operation. 

In order to hide the effects of an isolated lost frame, a predicted frame that is based upon previously received frames 
replaces the lost frame. When multiple consecutive frames are lost, muting is employed in order to indicate to the user 
at the receiving end that the transmission is interrupted. 
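
The substitution and muting behaviour described above can be sketched as follows. This is a simplified illustration rather than the procedure of GMR-1 06.011 [5]: the "predicted" frame is approximated by repeating the last good frame, and the threshold `MUTE_AFTER_LOST` and the function name `conceal` are hypothetical.

```python
from typing import List, Optional

MUTE_AFTER_LOST = 2  # hypothetical: mute from the second consecutive lost frame


def conceal(frames: List[Optional[List[int]]], frame_len: int = 160) -> List[List[int]]:
    """Replace lost frames (None) by substitution or muting.

    An isolated lost frame is replaced by a copy of the last good frame.
    From MUTE_AFTER_LOST consecutive losses onward, silence is output.
    """
    output: List[List[int]] = []
    last_good: Optional[List[int]] = None
    lost_run = 0
    for frame in frames:
        if frame is not None:
            last_good = frame
            lost_run = 0
            output.append(frame)
            continue
        lost_run += 1
        if last_good is not None and lost_run < MUTE_AFTER_LOST:
            output.append(list(last_good))   # substitution
        else:
            output.append([0] * frame_len)   # muting
    return output
```

For instance, with `frame_len=2`, the input `[[1, 2], None, [3, 4], None, None, None]` yields `[[1, 2], [1, 2], [3, 4], [3, 4], [0, 0], [0, 0]]`.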